Presentation: "Scalable Data Science and Deep Learning with H2O"

Time: Tuesday 13:55 - 14:45 / Location: Grand Ballroom C

H2O is a fast, scalable, open-source machine learning and deep learning platform for Smarter Applications. Using in-memory compression techniques, H2O can handle billions of data rows in memory, even on small compute clusters. The platform includes interfaces for R, Python, Scala, Java, JS and JSON, along with an interactive graphical Flow interface that makes it easier for non-engineers to stitch together complete analytic workflows. H2O was built alongside (and on top of) both Hadoop and Spark clusters and can be deployed within minutes. Sparkling Water combines the flexibility of Spark with the speed and accuracy of H2O's machine learning solution.

In this talk, we explain H2O's scalable in-memory architecture and design principles and outline the implementation of distributed machine learning algorithms such as Elastic Net, Random Forest, Gradient Boosting and Deep Learning. We will present a broad range of use cases and live demos that include world-record deep learning models, anomaly detection tools and approaches for Kaggle data science competitions. We also demonstrate the applicability of H2O in enterprise environments for real-world customer production use cases. We will cover data ingest, feature engineering, model tuning, model validation and model selection, as well as how to take models into production. Live demos will be run on distributed systems. By the end of this presentation, you will know how to create your own machine learning models on your data using R, Python (IPython Notebooks) or Flow.

Arno Candel, PhD - Chief Architect at H2O.ai

Biography: Arno Candel

Arno is the Chief Architect of H2O, a distributed and scalable open-source machine learning platform. He is also the main author of H2O's Deep Learning. Before joining H2O, Arno was a founding Senior MTS at Skytree where he designed and implemented high-performance machine learning algorithms. He has over a decade of experience in HPC with C++/MPI and had access to the world’s largest supercomputers as a Staff Scientist at SLAC National Accelerator Laboratory where he participated in US DOE scientific computing initiatives and collaborated with CERN on next-generation particle accelerators.

Arno holds a PhD and Masters summa cum laude in Physics from ETH Zurich, Switzerland. He has authored dozens of scientific papers and is a sought-after conference speaker. Arno was named “2014 Big Data All-Star” by Fortune Magazine. Follow him on Twitter: @ArnoCandel.

http://fortune.com/2014/08/03/meet-fortunes-2014-big-data-all-stars/